I-A186  436  5U8L INEAR  UPPER  BOUNDS  FOR  STOCHASTIC  PROGRAMS  NITH  i/1  ^ 

RECOURSE  REVISIONS)  MICHIGAN  UN IV  ANN  ARBOR 
J  R  BIRGE  ET  AL  JUN  87  TR-86-26  N00014-86-K-0628 
UNCLASSIFIED  F/G  12/4  NL 


m 


TO  FILE  copy 


sublinear  Upper  Bounds  for  Stochastic  Programs  with  Recourse 

John  R.  Birge 

Industrial  and  Operations  Engineering  - 
University  of  Michigan 
Ann  Arbor,  Michigan,  USA  48109 

Roger  J-B  Wets 
Mathematics 
University  of  California 
Davis,  California,  USA  95616 

Technical  Report  86-26 


_  n  J  ▼  _  .  . ^  i  nn 


Department  of  Industrial  and 
Operations  Engineering 


DTfC 

ELECTEI 

OCT  2  0  19871 


"jMSTgraU 

Approved  for  public  releaaej  I 
Distribution  UnlirT-iit-i^  I 

The  University  of  Michigan 

College  of  Engineering 

Ann  Arbor,  Michigan  48109-2117 


Sublinear  Upper  Bounds  for  Stochastic  Programs  with  Recourse 


John  R.  Birge 

Industrial  and  Operations  Engineering 
University  of  Michigan 
Ann  Arbor,  Michigan,  USA  48109 


Roger  J-B  Wets 
Mathematics 
University  of  California 
Davis,  California,  USA  95616 


Technical  Report  86-26 
Revised  June  1987 
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University  of  Michigan  University  of  California 
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Abstract:  Separable  sublinear  functions  are  used  to  provide  upper  bounds  on  the  re¬ 
course  function  of  a  stochastic  program.  The  resulting  problem’s  objective  involves  the 
inf-convolution  of  convex  functions.  A  dual  of  this  problem  is  formulated  to  obtain  an 
implementable  procedure  to  calculate  the  bound.  Function  evaluations  for  the  resulting 
convex  program  only  require  a  small  number  of  single  integrations  in  contrast  with  previ¬ 
ous  upper  bounds  that  require  a  number  of  function  evaluations  that  grows  exponentially 
in  the  number  of  random  variables.  The  sublinear  bound  can  often  be  used  when  other 
suggested  upper  bounds  are  intractable.  Computational  results  indicate  that  the  sublinear 
approximation  provides  good,  efficient  bounds  on  the  stochastic  program  objective  value. 

Keywords :  stochastic  programming,  sublinear  function,  simple  recourse  problem,  recourse 
model,  duality,  approximation. 
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1.  Introduction 

We  consider  the  following  stochastic  program  with  recourse: 

find  x  £  !Rni  such  that  Ax  =  6,  Tx  -  x  =  0,  x  >  0,  (1.1) 

and  z  —  cx  +  ®(x)  is  minimized, 

where  'I'(x)  =  E{V'(x>0}  an^  *P{x >  £)  is  the  recourse  function  defined  by 

V»(x,0=  inf  Uv  I  wy  =  S  -  x}-  (1.2) 

ye»+J 

The  random  m2-vector  £  is  defined  on  a  probability  space  (E ,B,P).  The  vectors  6  £ 
SRm‘ ,  c  €  iR"1,  q  £  5R"\  and  matrices,  A  £  SRm‘Xr*‘,  T  6  W  £  5RmjXn3,  are 

deterministic.  In  more  general  models,  q  and  T  may  be  stochastic,  but  we  confine  our 
attention  to  the  case  of  random  right-hand  sides  in  (1.2). 

Difficulties  in  performing  the  multiple  integration  to  evaluate  'P(x)  make  the  solution 
of  (i.l)  especially  complex.  Solution  procedures  that  do  not  assume  special  structure, 
therefore,  involve  some  approximation  to  'I'(x)  and  its  derivatives.  One  type  of  procedure 
is  to  sample  from  £  randomly  and  to  use  sample  information  to  guide  an  optimization 
algorithm.  These  stochastic  quasi-gradient  methods  (see  Ermoliev  (1983)  and  Ermoliev 
and  Gaivoronski  (1987))  have  asympotic  convergence  properties,  but  they  are  limited  by  a 
lack  of  computable  bounds. 

Other  approximation  procedures  for  (1.1)  generally  rely  on  discretizations  of  E  (Huang, 
Ziemba,  and  Ben-Tal  (1977),  Kail  and  Stoyan  (1982),  Birge  and  Wets  (1986a)).  The  lower 
bounds  are  based  on  Jensen’s  inequality  and,  in  general,  require  a  small  number  of  function 
evaluations.  The  upper  bounds,  however,  require  evaluating  r/>(x,  •)  at  all  extreme  points  of 
E  (or  some  region  containing  E),  i.e.,  at  least  2mj  solutions  of  (1.2)  with  varying  £.  These 
approaches  are  then  limited  to  problems  with  small  m2.  The  sublinear  upper  bound  we 
propose  requires  only  solving  0(m2)  linear  programs  (1.2),  so  that  a  much  broader  class  of 
problems  can  be  considered. 

Various  methods  have  been  suggested  for  solving  the  large-scale  linear  program  asso¬ 
ciated  with  a  discretization  of  E  (Wets  (1987)).  These  methods  include  basis  factorization 
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(Strazicky  (1980),  Kail  (1979)),  inner  linearization  (Dantzig  and  Madansky  (1961)),  and 
outer  linearization  (Van  Slyke  and  Wets  (1969)).  Implementations  are  described  in  Kail 
and  Keller  (1983)  and  Birge  (1985,  1987).  In  general,  the  number  of  realizations  in  the  dis¬ 
cretization  is  limited  to  numbers  in  the  hundreds  (i.e. ,  m2  <  10).  This  discretization  level 
is  not  always  enough  to  provide  close  bounds  on  the  objective  value.  Solution  times  can 
also  become  extremely  long.  The  situation  is  better,  however,  if  (1.1)  has  special  structure. 

The  stochastic  program  with  simple  recourse  is  a  special  instance  of  (1.1)  in  which 
W  —  (I,  — I),  where  I  is  an  m2  x  m2  identity  matrix.  This  allows  for  the  use  of  general 
nonlinear  optimization  procedures  (Nazareth  and  Wets  (1986))  and  combinations  of  linear 
and  nonlinear  procedures  (Qi  (1986)).  Separability  reduces  the  m2 -dimensional  integration 
to  m2  independent  one-dimensional  integrals;  it  renders  possible  the  direct  computation 
of  the  values  and  the  derivatives  of  'I'  for  use  in  optimization. 

We  obtain  efficient  upper  bounds  by  replacing  the  recourse  function  with  functions 
similar  to  simple  recourse  functions.  We  generalize  and  extend  the  ray  function  approx¬ 
imation  in  Birge  and  Wets  (1986a)  and  the  efficient  network  implementation  in  Wallace 
(1987).  We  show  that  linear  transformations  of  the  random  vectors  can  be  used  to  obtain  a 
variety  of  separable,  sublinear  bounding  functions.  We  also  give  a  dual-based  solution  pro¬ 
cedure  for  combining  these  bounding  functions.  The  separable  sublinear  functions  allow  for 
monotropic  optimization  (Rockafellar  (1984)).  The  convex  hull  of  several  of  these  functions 
is  used  in  the  calculation  of  the  upper  bound.  In  our  experience,  the  solution  of  this  upper 
bounding  problem  provides  quick  solutions  that  are  close  in  value  to  the  optimum.  It  may 
be  especially  valuable  in  providing  initial  solution  values  (as  in  Birge  and  Wets  (1984))  for 
further  optimization  procedures  (Nazareth  and  Wets  (1986)). 

The  simple  recourse  problem  and  its  properties  are  described  in  Section  2.  Section 
3  describes  the  sublinear  approximation  procedure  and  its  relation  to  the  simple  recourse 
problem.  Section  4  explains  how  the  sublinear  approximation  is  solved  using  its  dual,  and 
Section  5  discusses  our  implementation.  Section  6  presents  computational  results.  Section 
7  discusses  possible  extensions. 
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2.  The  Simple  Recourse  Problem 


The  simple  recourse  problem  has  a  separable  recourse  function  that  is  written  as 
*(x)  =  £  'MxO  where  ^.(x.)  =  /  V’ifXi. p<  is  the  marginal  distribution  func¬ 
tion  of  £,,  which  we  assume  has  bounded  first  and  second  moments,  and 

0,(xv.£.)  =  mf{?.+  y,+  +  i7 y.~  I  y,+  -  vf  =  6  -  x., y,+  >  o,y~  >  o} 

=  /^(6-Xi)  **<£,  (21) 

1  7.  (x.  -  £.)  if  X.  >  6- 

Detailed  properties  of  the  simple  recourse  problem  are  given  in  Wets  (1966,1974a)  and 
Parikh  (1968). 

The  simple  recourse  function  components  'P.(xi)  can  then  be  written  as 


where  <7,  =  q*  +  q~  >0  and  £,  =  / &dP<(£,).  The  functions  \  "^(x)  and  X.  h“+ 

*.(X.)  are  continuous  convex  functions  on  3Cm’  and  SC,  respectively  (Wets  (l974a)).  The 
„ubdifferential  at  x«  *9  given  by 

d'Mx.)  =  W  -  it  +7.P,'(X.)  <  *  <  +?.^.(X.)},  (2-3) 


where  Pt  (x.)  =  lim„rXi  p(y)  (Wets  (1974a)).  From  (2.3),  if  p  is  continuous,  then  is 
differentiable. 

The  development  of  our  sublinear  approximation  relies  upon  the  use  of  conjugate 
functions.  To  find  the  conjugate  of  It,,  first  define 

g.(p)  =  {y  I  P_(y)  <  p  <  P.(y)}-  (2-4) 


=  sup{u,x.  -  *.(x.)}- 

X  . 


( 


) 


The  conjugate  of  If,  is  written 


PROPOSITION  2.1.  The  conjugate  function  of  defined  in  (2.2)  is  given  by 


*‘K)  = 


-9,+  6  +  (v.  +  9,+  )y  -  qtyPi(y) 

+ft/!00  W(&), 

-9^?. 

9.'  & 


if  -9+  <  Vi  <  ?;  , 

if  v,  =  -qf, 
if  =  q-  , 

otherwise, 


where  yeG^). 

Proof.  From  (2.5)  and  (2.2),  we  have 


**  («<)  =  sup{viX.  “  9,+  +  9.+  X.  -  qiXiPiiXi) 


9,  f"  ZidPiUi)} 

J  —  oo 


=  -9,+  +  sup{(v,  +  <?,+  )x.  -  9.X.P.(x.) 


+  9*  r  ZidPi(Zi)}. 

j  —  OO 


From  (2.7),  it  is  clear  that  ^‘(v*)  =  -|-oo  if  t>,  <  -<?t+  or  v,  >  .  We,  therefore,  assume 

that  —qt  <  Vi  <  q~ .  For  v,  €  (-9,^,9,'")»  the  set  of  x.  that  attains  the  supremum  in  (2.7) 
is  (X<  I  PfiXi)  <  <  P<(X<)}  =  C'i('LL^J~)-  Substitution  of  y  €  G,(v-^-)  yields  the 

first  formula  in  (2.6).  For  on  the  boundary  of  [-9,+  ,?“]  and  bounded  distributions,  the 
same  argument  applies.  The  formulas  for  unbounded  distributions  follow  directly  given  our 
assumption  of  bounded  first  and  second  moments.  ■ 


The  following  corollaries  follow  immediately  from  Proposition  2.1. 

COROLLARY  2.2.  If  Pi  is  continuous,  then 

**  (v.)  _  /  -9,+  ?.  +  9.  /! oo  Z.dPi{Zi),  if  -<?,+  <  v,  <  q~  , 
\  +oo  otherwise. 


/here  y  €  G,('LL^L-). 


COROLLARY  2.3.  If  has  the  degenerate  distribution 

p  (£  )  =  /  1  Z>  >  Z,, 

(0  otherwise, 
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*•(„,)  -  f v<t<  if  -?.+  ^  ^  «f .  (2.10) 

(.  +oo  otherwise. 

In  addition  to  'P* ,  we  are  also  interested  in  its  subgradients. 

PROPOSITION  2.4.  The  subdifferential  of 4**  at  e  ri  (dom  4**)  (the  relative  interior 
of  the  effective  domain  of  4'*)  is  given  by 

a*r(Wi)  =  Gi(^i±^).  (2.11) 

?» 

Proof:  The  subgradients  are  found  directly  using  (2.6).  ■ 

3.  The  Sublinear  Upper  Bound 

Simple  recourse  functions  are  used  here  to  approximate  the  general  recourse  function 
function  t/>  as  defined  in  (1.2).  We  assume  that  t p(x,Z)  is  finite  for  all  values  of  x  and 
i.e.,  posW  =  5Rmi.  This  assumption  corresponds  to  complete  recourse  in  stochastic 
programming.  In  practice,  it  can  be  achieved  by  introducing  appropriate  penalties  in  the 
recourse  problem.  In  this  case,  the  function  <f>  defined  by 

<P(Z  -  x)  =  <P(x,  Z)  (3.1) 

is  sublinear  (positively  homogeneous  and  convex).  This  property  allows  the  simple  recourse 
function  approximation.  The  function  <f>  is  also  polyhedral  (i.e.,  its  epigraph  is  a  polyhedral 
cone). 

Birge  and  Wets  (1986a)  introduced  a  method  for  approximating  tp  by  simple  recourse 
functions.  This  method  was  based  on  solving  the  linear  program: 

find  y  E  S?"’  such  that  Wy  =  e,  ,  y  >  0,  (3.2) 

and  qy  is  minimized, 

where  e,  is  the  i th  unit  m2-vector.  The  optimal  solution  value  of  (3.2)  is  g+(i).  If  we 
substitute  -e,  for  e,,  the  optimal  solution  value  is  —  By  sublinearity, 

|  rni 

'P(XyZ)  <  tMX.O  -  </'/(.)  (x.,£,), 


(3.3) 


where 


lh<o(Xi.6)  =  | 


0/(0  (6  ~  X.)>£.  >  Xi 

?/(i)  (x»  —  £«)>£*  ^  x» • 


The  function  tpj  is  a  simple  recourse  function. 
By  integration  in  (3,3),  we  have 


*(X)  <  «/(x), 


(3.4) 


where 

*/(x)  =  Jjr(x,t)dP(Z) 

0/<o(&  "  Xt)dP{£i) 

>X. 

+  I  97{i)(Xi  -  ti)dP{Zi)}. 

Note  that  'i'/  is  separable  in  the  components  of  \  and  only  line  integration  is  required  in 
its  computation. 


Other  approximations  are  obtained  by  considering  directions  other  than  ie*  in  (3.2). 
Let  hx,...,hK  €  5Rm’  positively  span  5Rm*  (i.e-.R’"1  =  pos  [hx , ...,  hK]).  Substitute  h3,j  = 
for  e{  in  (3.2)  and  let  the  optimal  solution  values  be  qmj),j  =  1  We  then 

have 


VHX.O  <  in 


K  K 

!  X* =  £  _  X, >  0 ,j  =  1  >  ■■•>#}■ 

}=l  1=1 


(3.5) 


If  H  =  [hx,  includes  all  columns  of  W,  then  (3.5)  becomes  an  equality  (see  Birge 

and  Wets  (1986b)). 

A  difficulty  in  using  (3.5)  for  approximating  ^  is  that  in  general  an  optimization  must 
still  be  performed  inside  the  integral.  This  solution  is  immediate,  however,  if  H  is  a  positive 
linear  basis  for  5Rmj ,  i.e.,  every  point  in  corresponds  to  a  unique  positive  combination 
of  the  hi.  A  convenient  choice  for  such  a  set  is  to  use  a  linear  basis  D  =  ,  ...,dm, 

for  5Rmj  and  —D  =  [  —  dx , ...,  —  dm,],  so  that  pos  [D,-D]  =  9?m’.  In  this  case,  £  —  x  = 
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where  (D-1),.  indicates  the  ith  row  of  D  1 .  The  approximation  is 

V>(x»0  <  iMx.O  =  X3^(*)(x,0  (3-6) 


where 


.  ,  if(D-1).-.(f-x)>0, 

^(0 (x’ e)  ■  \  q-D[i){D'lV (x  -  0  if  (D- 1 ),. (X  -  0  >  0, 


and  q£{i)  and  qD[i)  are  the  optimal  solution  values  of  (3.2)  with  d,  and  -dit  respectively, 
substituted  for  e,  and  — e,. 

Several  different  bases  are  used  in  the  sublinear  approximation  of  ip.  For  D  — 
{  D1 , . . . ,  DL  } ,  a  set  of  linear  bases  for  SRm  1 , 


V'tx,  0  <  inf{^AJipD,(x,  £’)  |  >  0,  for  all  j}, 


=  co  (t PD,{xr),j  = 


the  function  obtained  by  taking  the  convex  hull  of  the  epigraphs  of  ipD,,j  =  1  Of 

course,  if  D  includes  all  linear  bases  in  W,  then  (3.8)  is  satisfied  as  an  equality.  Although  it 
might  be  simpler  than  (3.5),  inequality  (3.8)  is  still  difficult  to  use  computationally,  again 
because  of  the  minimization  required  inside  the  integral.  A  weaker,  but  usable  inequality 
is  obtained  by  reversing  the  infimum  and  integration.  For  this,  we  define 


* 

'Mx)  =  J2*D[i){x), 


where 


*£>,q(x)=  f  ^„)(0”),.((-x)^(  0 

■/(/?- 1 ).  ,(«-x)>o 

+  (  iwiD-'Mx-QdPiS), 


PROPOSITION  3.1.  Let  D  be  a  set  of  linear  bases  of  3?m’ ,  then 


'f'(x)  <  co  {tyD,D  e  £>}(x)- 


(3.10) 


Proof:  Integration  of  (3.6)  yields 


^(x)<^o(x)  (3.11) 

for  any  D  £  P.  So,  YITJi  ^’^(x‘)  <  ^‘^d(x‘)»  for  any  A*  >  0.  Letting  A*  =  1 

and  x  =  YITJl  A*X*  yie^s  (3.10)  by  the  convexity  of  <lM 

Equality  in  (3.10)  can  only  be  guaranteed  in  very  special  cases,  even  if  V  includes  all 
linear  bases  from  W .  To  see  this  simply  observe  that  if  D  is  rich  enough,  then 


/ 


co  VD  — 


/ 


inf 

D 


>  inf 

D 


/ 


Vp 


with  strict  inequality  except  in  degenerate  cases  such  as:  D  is  a  singleton,  the  'I'p’s  are 
linear,  the  probability  measure  is  degenerate,  etc.  But  as  suggested  by  Proposition  3.1, 
co  {^d,D  £  P}(x)  always  provides  us  with  an  upper  bound,  that  is  relatively  easy  to 
compute,  even  when  other  procedures  require  more  computations  than  can  be  implemented 
in  realistic  times  (i.e.,  the  solutions  of  more  than  1000  linear  programs  for  a  single  bound 
on  problems  with  more  than  10  random  variables).  We  have  also  observed  that  the  solution 
obtained  from  solving  the  “stochastic”  program 


find  x  £  5R"1  ,X  £  such  that  Ax  =  b,Tx  —  x,x  >  0, 

and  z  =  cx  +  co  ,  D  £  P}(x)  is  minimized, 

instead  of  (1.1),  is  usually  a  very  good  approximation  of  the  optimal  solution,  much  better 
than  may  be  expected  from  the  relatively  lax  inequality  (3.10).  Our  experience  shows  that 
the  function  co  {^d,D  £  D)  is  “parallel”  to  'f,  i.e., 

dvKx)  »  d  co  {*d,D  £  P}(x).  (3.12) 


In  the  Appendix,  we  provide  a  heuristic  argument  and  small  example. 


4.  Dualization  and  Solution  Procedures 

Several  questions  must  be  answered  in  order  to  solve  (1.1)  with  co{^D,  D  £  P}  sub¬ 
stituted  for  The  first  concern  is  that  finding  the  convex  hull  of  a  set  of  functions  is  itself 
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a  difficult  task.  The  second  problem  is  to  evaluate  i/D  efficiently.  A  third  area  involves  the 
choice  of  P.  We  address  each  of  these  problems  in  this  section.  After  substituting  for  'I', 
(1.1)  becomes: 

find  i€S"‘,xe  9?m’  such  that  Ax  —  b,Tx  -  x  =  ®>x  >  0,  (4.1) 


and  z  =  cx  +  co  {’ifp  ,D  £  P}{x)  's  minimized. 

Instead  of  solving  (4.1),  we  consider  a  dual  program  to  (4.1).  The  dual  program  has  a 
computational  advantage  because  the  convex  hull  operation  is  replaced  by  a  supremum. 

PROPOSITION  4.1.  A  dual  program  to  (4.1)  is  given  by 

find  a  £  9im  ‘ ,  n  £  SR"1’  such  that  a  A  +  nT  <  c  (4-2) 


and  w  =  ab  -  (sup  ^ )(-tt)  is  maximized, 

DeD 

where  is  the  conjugate  function  of  and  where  the  optimal  value  of  (4.2),  u>*  =  z‘ , 
the  optimal  value  of  (4.1). 

Proof:  A  general  dual  of  (4.1)  (see,  e.g.,  Rockafellar  (1974),  Geoffrion  (1971))  can  be  written 
as 

max{  inf  cx  +  g(x)  +  cr(6  -  Ax)  +  n(x  ~  Tx)}  (4.3) 

n  ,<r  x  >  0,  x 

where  <?(x)  =  co  {'Po  |  D  e  P}(x)-  Since  (4.1)  involves  linear  constraints,  the  optimal 
values  of  (4.3)  and  (4.1)  are  the  same.  We  can  rewrite  (4.3)  as 

max{  inf  (c  -  crA  -  nT)x  -  (-nx  -  g{x))  +  vb},  (4.4) 

n,(T  i  > 0,  x 

which  is  equivalent  to 

find  a  £  0?m  1 ,  it  £  SR"1’  such  that  o A  +  nT  <  c  (4-5) 


and  w  —  ab  -  <?*(  — tt)  is  maximized. 
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By  Theorem  16.5  of  Rockafellar  (1970) 


(  CO  {*D  I  D  e  D}(x)Y  =  sxip{<trD  I  D  e  D},  (4.6) 


yielding  (4.2).  I 

A  solution  of  the  dual  program  (4.2)  is  simpler  to  compute  than  a  solution  of  (4.1) 
because  g'  (  —  ir)  is  much  easier  to  evaluate  than  y(x).  We  must,  however,  obtain  an  expres¬ 
sion  for  ^'D.  First,  let  Po(i)  be  the  distribution  function  of  g,  =  (D-1),  (£  -  ^)  and  let 
li  =  J  Also  define 


gdh)(p)  =  { y  |  PD(i)(  y)  <p<  PD[i)(y)}- 


PROPOSITION  4.2.  The  conjugate  function  of  'I'p  is  given  by 

m  2 

=  M»  (4-7) 

«  =  i 

where 


(V) 


-lot, +  (vD  <  +iDd))v-  9D(i>yPD(,)(y) 
$,dPD  ft]  (f<), 

?D(  i|£. 

.  +00 


i^  ~  ?o(«)  <  •  <  9 DU)’ 

if  vD  %  —  —  (,• ) , 

if  vD  i  =  qD  (t  j , 
otherwise, 


where  qo{< |  —  ?£>(,]  ~b 


and  y  e  GD^)(- 


(4.8) 


Proof :  From  (3.9),  observe  that 


*D(,|(x)  =  J  W  {{D  1x)>,$,)dPDa){u),  (4.9) 

where  is  a  simple  recourse  function  as  in  (2.1)  with  and  g^(i|  substituted  for  qf 

and  q~  respectively.  From  (4.9),  it  follows  that 

*d(X)  =J2^((D~1X),)  =  *'(D~lx),  (4.10) 
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where  '!'*  has  the  form  of  the  simple  recourse  function  in  (2.2)  and  '!'*  is  the  simple  recourse 
function  defined  for  the  random  variables  u,  in  place  of  £1,. 

The  dual  of  'I'd  is  then  (see  Theorem  16.3  Rockafellar  (1970)) 

*'D  =(VD-ly  =  D-'i' t’Y,  (4.11) 

where 

D“1('I',)*(u)  =  (¥*)*(«£).  (4.12) 

Applying  (4.11)  and  (4.12)  to  (2.6)  yields  the  result.  ■ 

A  subgradient  of  'i/’D  is  used  in  the  optimization  procedure.  It  is  also  calculated  from 
the  subdifferential  of  ('I'*)*. 

PROPOSITION  4.3.  The  subdifferential  of  at  v  £  ri  (dom  \t'^))  is 

cK**d)H  =  {Dy\y€Xm\yieGD{i)(VD'  +-D{i)),i=l,...,m2}.  (4.13) 

7d(.) 

Proof:  If  w'  is  a  subgradient  of  (4'*)*  at  w,  then 

(*  -  w)  •  w*  +  (4>*)*(u;)  <  (^')*(z).  (4.14) 

From  (4.11)  and  (4.12)  if  z  =  uD  and  w  =  vD,  (4.14)  becomes 

(v-«)Du;*  +  ^(t;)<^(u).  (4.15) 

From  Proposition  2.4,  te*  =  (wf,..., w^})  where  w‘  S  G,("°  ’^Dq' — ),  so  Dy  in  (4.13)  is  a 
subgradient  of  'I'J,  at  v.  A  reverse  argument  shows  that  any  subgradient  of  'i'D  has  the 
form  Dy  in  (4.12),  proving  the  result.  ■ 

The  expressions  for  and  in  (4.7-8)  and  (4.13)  are  used  in  an  optimization 

procedure  for  (4.2).  A  difficulty  is  that,  even  when  each  is  differentiable  (i.e.,  the  dis¬ 
tribution  function  of  is  strictly  increasing  on  its  support,  but  not  necessarily  continuous), 
the  objective  function  in  (4.2)  is  not  necessarily  differentiable.  Nondifferentiable  methods 
(see,  e.g.,  Lemarechal  (1978),  Lemarechal  et  al.  (1981),  Nazareth  and  Wets  (1986),  Polak 
(1987),  Wolfe  (1975))  can  be  applied  to  this  program.  We,  however,  transform  (4.2)  into 


V  V  V  "J  V  -J. ~  ~.v. 
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a  smooth  optimization  problem  with  nonlinear  constraints  (as  suggested,  for  example,  in 
Gill,  Murray,  and  Wright  (1981)).  The  new  problem  becomes 


find  <7  €  !Rmi 

,  7T  €  !Rmj ,  0  6  3?  such  that 

(4.16) 

a  A  +jtT 

<  c 

-9  <  0,  for  all  D  £  D ,  and 

w  =  ab 

- 9  is  maximized. 

General  methods  for  optimization  problems  with  nonlinear  constraints  can  be  applied  to 
(4.16).  In  our  examples,  we  use  the  MINOS  computer  code  (Murtagh  and  Saunders  (1980)) 
to  solve  (4.16).  Note  that  (4.16)  is  similar  in  form  to  a  master  problem  in  a  linear  outer 
approximation  algorithm  (e.g.,  the  L-shaped  method),  but  we  solve  the  dual  and  use  a 
nonlinear  outer  approximation. 

5.  Implementation  Considerations 

In  solving  (4.2)  or  (4.16)  one  needs  to  find  Po(i),  the  distribution  function  of  f,, 
and  GD(i),  the  inverse  function.  If  the  random  variables  are  independently,  normally 
distributed  with  means,  /*,  ,  and  variances,  a2- ,  then  &  =  (D-1)*  (£  -  \)  is  also  normally 
distributed  with  mean  (D~  *)o  ~  Xj)  and  variance  .  (Note  that 

degenerate  and  correlated  random  variables  can  also  be  included  among  the  £,  with  f, 
remaining  normally  distributed.)  Other  distributions  require  special  schemes  in  order  to 
integrate  with  respect  to  $*.  In  our  experiments,  we  used  normally  distributed  random 
variables  because  of  the  ease  in  performing  these  computations.  Upper  bounds  can  be 
obtained  for  other  distributions  by  using  the  approaches  in  Birge  and  Wets  (1986a,  1987). 

Given  a  problem  form  ((4.2)  or  (4.16))  and  a  method  for  finding  Pd[\),  the  set  of  bases 
must  still  be  chosen.  For  the  problem  of  approximating  it  appears  that  the  matrices  to 
include  in  D  should  be  chosen  so  that  the  level  sets  of  rpD  cover  high  probability  regions 
of  the  level  sets  of  t />.  This  coverage,  however,  depends  on  x>  so  the  choices  should  be 
good  for  a  range  of  values  of  x  (that  would  ideally  include  the  optimal  value  of  x)-  In  our 
experience,  the  identity  provided  a  good  starting  basis  ,  especially  when  the  optimal  x  was 
close  to  £. 
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We  implemented  several  basis  generation  procedures  that  started  with  the  identity  as 
the  first  basis  and  then  included  additional  bases.  Generating  random  bases  from  the  set  of 
all  bases  proved  inefficient  because  the  corresponding  functions  'i/‘D  often  did  not  improve 
the  solution.  Bases  were  then  generated  from  the  set  of  optimal  bases  for  some  £  by  solving 
(1.2)  for  varying  values  of  £  and  A  patterned  choices  of  values  for  £  (using  £,  ±  3a,  for 
each  »)  proved  slightly  more  efficient  than  random  selection  of  £,  and  was,  therefore,  used 
in  the  experiments  described  below.  Two  different  choices  for  x  were  implemented.  On 
the  A:th  solution  of  (4.16)  with  new  bases  added  to  P,  we  used  x  =  X*-1  or  —  X*-1  where 
X*-1  was  the  optimal  value  of  x  from  the  ( k  -  l)th  solution  of  (4.16).  In  our  experiments, 
X  =  —  X*-1  proved  more  effective  because  it  included  a  broader  class  of  bases  by  exploring 
different  regions  of  x- 

The  sublinear  upper  bounding  method  with  this  implementation  is  given  below.  The 
basic  tolerance  parameters  are  TOLINF  for  infinity  and  TOLCONV  for  convergence. 

Sublinear  Upper  Bounding  Method 

Step  0.  Initialization.  Let  Dl  =  I,  K  =  1,  P={Dl ,  •  •  • ,  DK  },  J  =  0,  wolJ  =  TOLINF , 

/  =  0,  x  =  o. 

Step  1.  Add  New  Basis.  Let  DK  =  [DK  ( 1), . . . ,  DK  (m2)]. 

a.  For  i  =  1, . . .  ,m2,  if  DK  {%)  £  J,  let  J  =  J  u  DK  (i)  and  find  r 

b.  For  »  =  1, . . . ,  m2,  calculate  parameters  for  PD^  > . 

If  K  =  1,  go  to  3. 

Step  2.  Search  for  New  Basis.  For  »  =  l,...,m2,  /  =  WO  +  1)3*~ 1 ,  let  h(i)  = 

?,  +  3k(i)oi.  Find  «/<(x,h)  with  an  optimal  basis,  D' . 

If  D'  i  P,  DK  +  l  =  D\  K  =  K  +  1,  go  to  1. 

Else,  if  l  <  3mj,  1  =  1+  1,  repeat. 

Else,  go  to  3. 

Step  S.  Find  New  Bound.  Solve  (4.16)  to  obtain  wnrw  and  let  the  dual  variables  associated 
with  the  linear  constraints  be  x.  Let  x  =  Tx. 


If  TOLCONV  <  ( wold  -  Unew)  (/w„eiu  if  Wnew  ■£  0),  go  to  2. 


Else,  stop,  wnem  is  the  sublinear  upper  bound. 

6.  Numerical  Results 

Formulation  4.16  was  used  with  the  MINOS/AUGMENTED  (MINOS  Version  4.0) 
computer  program  for  nonlinearly  constrained  problems.  This  implementation  on  The 
University  of  Michigan’s  Amdahl  5860  Computer  found  optimal  solutions  for  (4.16)  for  all 
of  the  test  problems  tried. 

The  appropriate  use  of  tolerances  on  constraint  satisfaction  was  especially  important 
in  our  implementation.  A  value  of  10-5  was  used  for  ROW  TOLERANCE  in  the  MINOS 
SPECS  file.  This  allowed  some  flexibility  in  satisfying  the  constraints  without  creating 
large  infeasibilities.  To  avoid  infinite  values  of  ,  the  constraints 

Qd (»)  ^  v&i  5:  ?D(i)>  (5-1) 

were  added  to  (4.16).  A  penalty  term  was  included  in  the  subgradient  definition  from  (4.13) 
when  (5.1)  was  satisfied  as  an  equality  (within  the  tolerance).  This  made  the  subgradient 
definition  consistent  at  the  boundary  of  the  effective  domain  of  • 

Problem  testing  initially  involved  simple  recourse  problems  which  were  solved  exactly 
by  the  formulation  in  (4.16).  After  conducting  this  check,  the  method  was  applied  to 
general  recourse  problems.  The  results  reported  here  apply  to  the  small  energy  decision 
problem  in  Louveaux  (1987).  In  this  problem,  =  2,  nx  =  4,  m2  =  7,  and  n2  =  12. 
Of  the  seven  recourse  problem  constraints,  four  are  balancing  constraints  that  are  fixed  at 
zero  and  three  are  demand  constraints  that  are  stochastic.  The  demands  were  assumed 
independent  and  normally  distributed.  Different  problems  were  generated  by  varying  the 
means  and  standard  deviations  of  these  random  variables  (see  Table  1). 

To  check  the  accuracy  of  the  sublinear  approximation  method,  the  test  problems  were 
also  solved  using  the  L-shaped  code,  NDSP(Birge  (1985)),  with  upper  bounds  from  the 
Edmundson-Madansky  inequality  (Madansky  (1959))  and  lower  bounds  from  Jensen’s  in¬ 
equality.  After  the  fcth  solution  of  the  problem  with  these  bounds,  the  bounds  were  refined 


for  each  £,  until  NDSP’s  limit  of  125  realizations  of  the  random  vector  £  was  reached.  This 
occurs  here  at  k  =  4,  but  note  that  the  Edmundson-Madansky  bound  in  NDSP  cannot  be 
applied  even  once  to  provide  an  upper  bound  for  problems  with  more  than  seven  random 
variables.  The  sublinear  upper  bound  does  not  have  this  limitation. 

The  results  for  the  upper  bounds  and  CPU  second  times  for  each  iteration  appear  in 
Table  2  under  the  NDSP  columns.  The  upper  and  lower  bounds  on  the  final  iteration  value 
appear  in  Table  3.  Note  that  in  Problem  8  after  33.80  CPU  seconds,  NDSP  terminated 
because  a  limit  of  100  L-shaped  algorithm  iterations  (corresponding  to  100  cutting  planes) 
had  been  performed.  Execution  was  not  continued  because  of  large  row  residuals  caused 
by  instability  in  the  bases  including  these  cuts  (see  Birge  (1986)  for  a  discussion  of  this 
phenomenon). 

The  sublinear  upper  bounding  approximation  results  appear  in  Tables  2  and  3.  As 
given  in  Section  5,  the  algorithm  is  terminated  when  the  addition  of  new  bases  does  not 
change  the  solution.  This  is  indicated  by  “b”  in  Table  2.  Further  improvement  would  be 
possible  using  groups  of  random  variables  as  noted  in  the  next  section.  The  upper  bound 
result  of  solving  (4.16)  and  the  CPU  seconds  to  obtain  this  result  appear  in  Table  2.  The 
computation  times  include  some  iteration  logging  that  is  comparable  but  not  identical  to 
iteration  log  times  included  in  the  CPU  times  for  NDSP. 

Table  3  provides  bounds  on  the  objective  values  Z[NDSP )  =  cxND  +  ^(xatd)  and 
Z(SL)  =  cxSL  +  't'lxsrj,  where  {xNp,  Xn d)  is  the  last  solution  found  by  NDSP  and 
(xsl  ,  Xsl)  is  the  last  solution  found  by  the  sublinear  upper  bounding  (SL)  method.  Upper 
bounds  are  found  using  the  Edmundson-Madansky  inequality,  and  lower  bounds  are  found 
using  the  Jensen  inequality.  The  relative  differences  between  the  means  of  the  upper  and 
lower  bounds  given  by  NDSP  and  the  sublinear  method  are  also  given  in  Table  3. 

From  Table  2,  it  appears  that  the  sublinear  method  provides  upper  bounds  that  are 
close  to  the  NDSP  values  within  times  that  are  comparable  with  the  NDSP  times.  Table 
3  indicates  how  well  the  x  and  x  solutions  from  the  sublinear  method  perform  in  (1.1). 
The  upper  and  lower  bounds  result  from  upper  and  lower  approximations  of  'I'  In  Table 
3,  note  that  the  mean  values  of  the  upper  and  lower  bounds  on  Z(SL)  are  at  most  5.8^ 


A  key  property  to  note  in  examining  Table  2  is  that  the  sublinear  method  times  are 
comparable  for  all  problems,  but  the  accuracy  relative  to  the  NDSP  bounds  does  not 
degrade  as  problem  size  increases.  The  NDSP  approximations,  however,  become  much 
worse  as  additional  random  variables  are  included  into  the  problem.  The  true  advantage 
of  the  sublinear  method  is,  therefore,  in  larger  problems,  as  is  evident  in  Problems  7-9. 
Here,  the  sublinear  method  obtains  relatively  good  bounds  (within  25%  of  the  best  lower 
bound)  in  fractions  of  the  times  for  NDSP.  These  experiments  were  limited  to  problems 
where  an  Edmundson-Madansky  could  be  obtained.  Again,  it  should  be  emphasized  that 
the  sublinear  upper  bound  can  be  applied  to  problems  where  the  Edmundson-Madansky 


Table  3.  Upper  and  Lower  Bounds  on  Last  Solution  Value 


Problem 

NDSP 

Sublinear  Approximation 

Upper 

Lower 

Upper 

Lower 

Mean 

Bound 

Bound 

Bound 

Bound 

Difference* 

1 

366. 

363. 

371. 

370. 

.016 

2 

335. 

334. 

343. 

343. 

.026 

3 

289. 

289. 

289. 

289. 

.000 

4 

424. 

419. 

438. 

436. 

.037 

5 

377. 

374. 

390. 

389. 

.037 

6 

347. 

345. 

350. 

349. 

.010 

7 

466. 

438. 

494. 

462. 

.058 

8 

491. 

316. 

369. 

337. 

-.125 

9 

471. 

433. 

486. 

465. 

.052 

*  —  (( UB{SL )  +  LB(SL))/ 2  -  ( UB{NDSP )  +  LB(NDSP))/2)/{{UB{NDSP) 
+  LB(NDSP))/ 2) 
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7.  Discussion  and  Extensions 

The  results  in  Section  6  indicate  that  the  sublinear  upper  bounding  method  is  a  good 
upper  bounding,  approximation  procedure  for  stochastic  programs,  especially  when  other 
bounds  are  not  computationally  tractable.  The  sublinear  method  provides  good  bounds 
on  the  optimal  objective  value  quickly.  The  solutions  obtained  from  it  are  also  generally 
close  in  value  to  the  solutions  obtained  by  other  solution  procedures.  In  addition,  the 
sublinear  method  can  be  applied  to  problems  with  many  random  variables  where  other 
methods  cannot  be  applied.  This  ability  for  general  stochastic  programs  sets  the  sublinear 
method  apart  from  all  other  methods.  The  sublinear  bound,  however,  only  majorizes  the 
optimal  objective  value.  Lower  bounds,  obtained  from  the  Jensen  inequality  or  other  outer 
linearization  approach  (see  Marti  [1975]  and  Birge  and  Wets  [1986a])  should  be  used  in  a 
full  optimization  procedure. 

A  convergent  procedure  for  bounding  the  recourse  function  value  can  be  obtained  by 
evaluating  the  recourse  function  with  respect  to  groups  of  random  variables.  As  the  number 
of  random  variable  i::  these  groups  increases  the  bound  is  tightened  until,  if  all  random 
variables  are  included  in  a  single  integration,  the  true  objective  value  is  obtained.  This 
procedure  is,  of  course,  only  useful  as  long  as  the  integrals  are  simple  enough  to  allow  easy 
calculation.  For  example,  suppose  m2  =  3, 


=  min {qy  \  Wy  =  (£t  -  Xi>6j  -  X2,0)r,y  >  0} 


=  *ifl  +  *2?2 


(7.1) 


where  (ft ,  ft)  =  (£i  -  Xi  >  £2  -  X2)  €  /?  ,«'=  1, ...,  r,  and  we  can  evaluate 


^2(Xl.X2 


(7.2) 


We  then  combine  the  approximation  in  (7.2)  with  a  simple  recourse  function  approximation, 
^ 3 ( X 3 )  1  evaluated  with  respect  to  ft  =  £3  -  *3-  The  result  is  another  upper  bound  on 
where 


*(x)  <  *2(Xl.X2)  +  *3(X3)- 


(7.3) 
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Again,  convex  combinations  of  approximations  of  the  form  (7.3)  for  every  pair  of  random 
variables  can  be  used  to  approximate  from  above.  The  practicality  of  this  approach 
of  course  relies  upon  the  possibility  to  evaluate  integrals  as  in  (7.2)  efficiently.  For  large 
groups  of  random  variables,  this  effort  may  negate  any  advantages  from  this  approximation 
scheme. 

Appendix 

The  following  heuristic  argument  and  small  example  give  some  justification  for  the 
observation  in  (3.12).  Suppose  (i0,x0)  is  a  feasible  solution  of  (1.1)  that  is  near  the 
optimal  solution.  Suppose  D  contains  all  bases  (in  W")  that  correspond  to  basic  optimal 
solutions,  for  any  possible  pair  (£,x),  °f  the  linear  program  that  defines  £),  cf-  Wets 
(1974b)  for  the  Basis  Decomposition  Theorem.  Also,  suppose  that  for  some  D0  E  D , 

co  {«d,Z)6P}(x0)  =  ^o(x0).  (A.l) 

(A  convex  combination  of  values  would,  in  general,  appear  above.  We  consider  a  single 
function  to  simplify  the  argument.)  Given  (A.l) 

d{  co  *D}(x0)  =  d¥o0(x°) 

=  /  drPDo(X°,Z)dP(0. 

Observe  that  dipDo(x°,Z)  =  -q£o  if  £  €  E0)  where 

-o  =  { £  I  Do  1  £  >  Dq  1  X°  }  • 

Since  'iDo  gives  the  convex  hull  operation  value,  the  region  of  E  in  which  tyDn  is  exact  should 
be  large.  One  can  then  reasonably  expect  E0,  or  more  generally  the  region  surrounding  E,> . 
to  contain  most  of  the  probability  mass  of  E.  Hence  the  subgradients  of  d'F Dn  at  x°  are 
approximated  by  On  the  other  hand  (Section  7,  Wets  (1974b)),  excluding  possibly 

some  boundary  cases, 

a*(x°)  =  J  W(x°,  £)<*/>(£) 

=  Zi-*o)plD-'<i>D-^]  {A~] 

dgp 
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where  irD  are  the  multipliers  associated  with  the  basis  D.  Again,  since  H0  and  its  neighbors 
occupy  much  of  the  probability  mass  of  5,  d'P(x°)  is  reasonably  well  approximated  by 
—  nDa.  From  construction,  q^o  =  Xd0-  Hence,  both  d  co  {’I'd}  and  d’l'  at  x°  are  near 

~*D0- 

As  an  example,  consider  the  stochastic  program: 

find  z,  y^ ,  y~  ,  ,  y2  ,  y3  >  0  such  that  (A. 3) 


t  ix  +y i+  -yf 
t2x 


<  1 

+y3  =  £1 


t2x  +y^  -y2  +y3  =  6 

to  minimise  z  —  cx  +  Ef  [y+  +  yf  +  y2  +  y2  +  y3]. 

where  £,  is  uniformly  distributed  on  [0,1],  for  i  —  1,2.  The  optimal  basis  (or  the  related 
basis  with  positive  coordinates)  is  indicated  for  each  region  of  [0,  l]  x  [0, 1]  in  Figure  1.  The 
three  bases  i.;e 

*  =  (!  !)•*  =  (!  :)•*-(!  !)■ 

We  use  D1 , ,  D2 ,  D3  to  develop  approximating  functions  1  >  ^d  ’  and  ’I'd  » 


- 

BASIS  REGIONS 

o’ 

o! 

- 

0* 

D’ 

i  •  i 

o’ 

A  ■»  — ■ - K  1 — lL  -  -  -i- 
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Figure  1.  Optimal  Subproblem  Basis  Regions 

A  comparison  of  the  three  approximations  and  the  exact  value  of  <l»  is  given  in  Figure 
2,  where  X2  =0.1.  Note  that  the  slope  of  the  convex  hull  of  the  approximating  functions 
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closely  approximates  the  slope  of  vf*.  At  low  values  of  xi,  the  D3  and  D2  optimal  region 
occupy  most  of  E  and  hence,  convex  combinations  of  their  subgradients  provide  good  ap¬ 
proximations  of  At  higher  values  of  Xi ,  the  optimal  regions  for  D3  and  D2  diminish, 
so  that  provides  a  good  approximation  of  <3^. 

£«ACt  (Sot.®  LINC) 


Figure  2.  Comparison  of  \b  Values. 
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